A targets-based superpositional model of fundamental frequency contours applied to HMM-based speech synthesis
Abstract
A superpositional model of fundamental frequency (F0) contours, as proposed in the Fujisaki model, represents the F0 movements of speech well while keeping a clear relation to the linguistic information of utterances. Exploiting this property is therefore expected to improve HMM-based speech synthesis. This paper proposes a targets-based superpositional model in the light of the Fujisaki model. Both the accent and phrase components are parameterized by separately defined low and high targets, which allows flexible interaction between the two components. Owing to this flexible interaction, the new method treats complex F0 movements such as low digging, varying declination, and final lowering consistently, simply by adjusting parameter values. This facilitates extraction of the model parameters from observed F0 contours, which is one of the major problems preventing the use of the Fujisaki model. Extraction of the target parameters is evaluated on a Japanese speech corpus, and the F0 contours generated by the model are used for HMM training in place of the original contours. A listening test of the synthetic speech indicates significant improvements in speech quality. Micro-prosodic effects are also investigated; the results show that adding micro-prosody to the generated F0 contours does not significantly improve speech quality.
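For orientation, the classical Fujisaki model that the abstract builds on can be sketched as follows. This is not the targets-based model proposed in the paper, whose parameterization is not given in the abstract; it is only the standard superposition of a baseline, phrase components, and accent components in the log-F0 domain. The function names and the default values of `alpha`, `beta`, and `gamma` are illustrative assumptions.

```python
import math


def phrase_response(t, alpha=3.0):
    """Phrase control response: alpha^2 * t * exp(-alpha * t) for t >= 0, else 0."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=20.0, gamma=0.9):
    """Accent control step response, clipped at the ceiling gamma, for t >= 0."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_log_f0(t, base_f0, phrase_cmds, accent_cmds,
                    alpha=3.0, beta=20.0, gamma=0.9):
    """Log F0 at time t (seconds) as baseline + phrase + accent components.

    phrase_cmds: list of (onset_time, magnitude) impulse commands.
    accent_cmds: list of (onset_time, offset_time, amplitude) step commands.
    All parameter values here are hypothetical, chosen only for illustration.
    """
    log_f0 = math.log(base_f0)
    for onset, magnitude in phrase_cmds:
        log_f0 += magnitude * phrase_response(t - onset, alpha)
    for onset, offset, amplitude in accent_cmds:
        log_f0 += amplitude * (accent_response(t - onset, beta, gamma)
                               - accent_response(t - offset, beta, gamma))
    return log_f0


if __name__ == "__main__":
    # One phrase command and one accent command on a 120 Hz baseline.
    phrases = [(0.0, 0.5)]
    accents = [(0.3, 0.7, 0.4)]
    for i in range(0, 11):
        t = i * 0.1
        f0 = math.exp(fujisaki_log_f0(t, 120.0, phrases, accents))
        print(f"t={t:.1f}s  F0={f0:.1f} Hz")
```

Because the components add in the log domain, each command scales F0 multiplicatively, which is what lets the phrase and accent contributions be fitted and interpreted separately.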
Similar resources
Superpositional Modeling of Fundamental Frequency Contours for HMM-based Speech Synthesis
Statistical parametric speech synthesis technologies, such as HMM-based and DNN-based ones, gain special attention from researchers because of their ability in generating speech in various voice qualities and styles. In these methods, all acoustic parameters (except durational ones) are handled in a frame-by-frame manner, which is not appropriate for prosodic features. Although relation of adja...
Improved Automatic Extraction of Generation Process Model Commands and Its Use for Generating Fundamental Frequency Contours for Training HMM-based Speech Synthesis
Generation process model of fundamental frequency (F0) contours can well represent F0 movements of speech keeping a clear relation with linguistic information of utterances. Therefore, by using the model, improvement of HMM-based speech synthesis is expected. One of major problems preventing the use of the model is that the performance of automatic extraction of the model parameters from observ...
Using F0 Contour Generation Process Model for Improved and Flexible Control of Prosodic Features in HMM-based Speech Synthesis
Generation process model of fundamental frequency contours known as Fujisaki's model is ideal to represent global features of prosody. It is a command response model, where the commands have clear relations with linguistic and para/non linguistic information included in the utterance. Therefore, by controlling fundamental frequency contours in the framework of the generation process model, a mo...
Corpus-Based Hidden Markov Modelling of the Fundamental Frequency of Lithuanian
This paper presents the corpus-driven approach in building the computational model of fundamental frequency, or F0, for Lithuanian language. The model was obtained by training the HMM-based speech synthesis system HTS on six hours of speech coming from multiple speakers. Several gender specific models, using different parameters and different contextual factors, were investigated. The models we...
Synthesizing dialogue speech of Japanese based on the quantitative analysis of prosodic features
Through the analyses of fundamental frequency contours and speech rates of dialogue speech and also of read speech, prosodic rules were derived for the synthesis of spoken dialogue. As for the fundamental frequency contours, they were first decomposed into phrase and accent components based on the superpositional model, and then their command magnitudes/amplitudes were analyzed by the method of m...
Publication date: 2013